NOTES AND LITERATURE

AN OUTLINE OF CURRENT PROGRESS IN THE THEORY
OF CORRELATION AND CONTINGENCY

Workers in the physical sciences realized long ago that cer- 
tain progress depended upon the precision of their instruments 
of measurement and the adequacy of their methods of mathe- 
matical description and analysis. Biologists, here and there, are 
beginning to see the importance of the analytical as well as of the 
observational tools. Among the analytical formulae none are
of greater usefulness than those for measuring interdependence. 
It may not be out of place, therefore, to sketch in simple terms 
for the benefit of those who are interested in the methods only as 
a means to an end, the progress which is being made in the per- 
fection of these instruments of research. 

The term current as used in these paragraphs is made more 
comprehensive than is conventional; some of the citations are 
four or even more years old. The elasticity of the term is justi- 
fied in dealing with the literature of a field in which progress is 
particularly difficult and in which actual contributions are incor- 
porated but slowly into the working technique of the biologist. 
Indeed, biologists as a class still think of correlation as synony- 
mous with the classical product-moment method. How erroneous 
this impression is will appear in the following pages. 

The purpose of this review is therefore to indicate in non- 
mathematical terms easily comprehensible to biological readers 
the lines of advances in the theory of the measurement of inter- 
dependence in order that they may the more easily select for 
dealing with their actual data, formulas of the existence of 
which they might otherwise be unaware. 

The progress which we have to consider has been along four 
different lines: 

(a) In the simplification of methods of computation in the
case of familiar formulas. (b) In the development of entirely
new formulas applicable to data of particular sorts. (c) In the
determination of the corrections to be applied for grouping into
"broad categories." (d) In partial correlation, multiple
correlation, and the correlation of indices and increments.

In this review we shall limit ourselves strictly to an outline of
progress which has been made in the theory of the measurement
of the interrelationship of two variates, leaving for consideration
at a later time the far more complex subjects of correction
for grouping, partial and multiple correlation, variate difference
correlation and some other topics.

The detailed advances may be most easily understood by con- 
sidering the kinds of data with which one has to deal in deter- 
mining the degree of interdependence, association or correlation 
(to use these terms in a broad sense) between two variates. 

An arrangement of the literature according to a key similar to 
that familiar in taxonomic works will perhaps be of service to 
the biologist who desires to locate at once the literature perti- 
nent to the particular kind of data with which he has to deal. 

Suppose first of all that the two characters are both suitable 
for measurement (or counting) on a quantitative scale and that 
for both the measurements form several classes. The choice of 
methods for measuring the correlation between them will then 
depend upon whether the average values of the y character as- 
sociated with serially arranged values of the x character lie in 
sensibly a straight line or whether they can best be represented 
by some more complex curve. Linearity of regression, as it is 
technically called, has therefore a two-fold significance. (a)
Biologically, it shows that an associated character changes at a
uniform rate (however slight this rate may be) with the variation
of a selected character. (b) Statistically, it justifies the
application of the familiar product-moment method of determining
the correlation coefficient.
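
As a point of reference for what follows, the product-moment coefficient may be sketched in a few lines. The function name and the sample values are illustrative assumptions, not taken from any of the papers cited; the arithmetic is the ordinary ratio of the product moment to the geometric mean of the two second moments.

```python
import math

def product_moment_r(xs, ys):
    """Product-moment correlation of paired measurements xs, ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and of products about the two means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements, invented for illustration.
lengths = [10.1, 11.4, 12.0, 13.2, 14.8]
widths = [4.9, 5.6, 5.8, 6.7, 7.1]
print(round(product_moment_r(lengths, widths), 3))
```

The coefficient is a fair summary only when regression is sensibly linear, which is the condition stated above.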

Both Characters Measurable on a Quantitative Scale, Regres- 
sion Linear. — So satisfactory has the product-moment method 
proved for data in which both characters are measurable and 
regression is sensibly linear, that no fundamental advance has 
been made for several years. Boas's 1 first formula is, as pointed 
out by Pearson, 2 merely another form of the difference method, 
which has been in use for many years. 

1 Boas, F., "Determination of the Coefficient of Correlation," Science,
N. S., 29: 823-824, 1909.

2 Pearson, K., "Determination of the Coefficient of Correlation," Science,
N. S., 30: 23-25, 1909.

Several modifications of a purely technical nature which facilitate
calculation or are useful in special cases have been published.
Pearson 3 has given a new approximate difference method
which is serviceable in special cases only. Harris 4 has suggested 
a novel difference method for exact work with tables. An alter- 
native method of calculating rough moments and product mo- 
ments, given by Elderton, 5 seems to have attracted little atten- 
tion, although it has certain advantages for use in adding-machine 
computations. A product moment method which possesses 
marked advantages for use with machines which allow of simul- 
taneous multiplication and summation, and which obtains inci- 
dentally the data necessary for testing linearity of regression or 
computing the correlation ratio, η, is now available. 6 In the
special cases in which the two characters to be entered in the
correlation table are not differentiated, e. g., stature of pairs of 
brothers, length of Paramecium, etc., the tables are ordinarily 
rendered symmetrical by using each individual once as the x and 
once as the y member of the pair. This may be done by actually 
forming the symmetrical table, or by using the simple formula 
proposed by Jennings. 7 If, as is frequently the case, more than 
a single pair of individuals are associated, the labor of forming 
tables becomes very great. Each individual of a family, each
organ of an individual, or each individual measured from a particular
environment, must then be entered in the table in combination
with every other one. Since the number of combinations
in each class is n(n − 1) and the number of classes must
be at least moderately large, the total number of combinations is 
very great. Thus the data for number of nipples in swine 
recently published by Parker and Bullard 8 require a table of 
34,884 combinations to determine the fraternal correlation for 
number of nipples. In the case of the Hydra data analyzed by 
Lashley, 9 tables with from one to nearly two hundred thousand 

3 Pearson, K., "On Further Methods of Determining Correlation,"
Drapers' Company Research Mem., Biom. Ser., IV, Dulau and Co., 1907.

4 Harris, J. Arthur, "A Short Method of Calculating the Coefficient of
Correlation in the Case of Integral Variates," Biometrika, 7: 214-218, 1909.

5 Elderton, W. P., "An Alternative Method of Calculating the Rough
Moments from the Actual Statistics," Biometrika, 4: 374-378, 1905. Also
in his "Frequency Curves and Correlation."

6 Harris, J. Arthur, "The Arithmetic of the Product Moment Method of
Calculating the Coefficient of Correlation," Amer. Nat., 44: 693-699, 1910.

7 Jennings, H. S., "Computing Correlation in Cases Where Symmetrical
Tables are Commonly Used," Amer. Nat., 45: 123-128, 1911.

8 Parker, G. H., and C. Bullard, Proc. Amer. Acad. Arts and Science, 49:
399-426, 1913.

9 Lashley, K. S., Jour. Exp. Zool., 19: 210, 1915.

combinations are given. Methods for the rapid formation of
symmetrical tables from which either correlation or contingency 
coefficients may be calculated 10 and for the formation of con- 
densed tables from which correlation coefficients 11 only may be 
deduced greatly reduce the necessary labor in such cases. For 
the testing of linearity of regression in the case of these intra- 
class and inter-class correlations, tables are essential. The use of 
such coefficients would, however, be greatly facilitated if calcula- 
tion could be carried out directly from moments computed from 
the classes themselves. Harris 12 has given an exhaustive series 
of formulae by which this can be accomplished, with examples 
showing the wide applicability of such coefficients. For example, 
these formulas fulfil more adequately the purpose of Boas's 
second formula (loc. cit.). 
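
The saving in labor can be illustrated by a minimal sketch which computes an intra-class coefficient twice: once by actually entering every one of the n(n − 1) ordered pairs of each class, and once from sums and sums of squares taken class by class. The function names and the data are assumptions made for illustration, and the moment expression is the ordinary algebraic identity for a symmetrical table, not a transcription of the formulae in the paper cited.

```python
from itertools import permutations

def intraclass_r_from_pairs(classes):
    """Enter each member once as x and once as y with every other member."""
    xs, ys = [], []
    for members in classes:
        for x, y in permutations(members, 2):  # n(n - 1) ordered pairs
            xs.append(x)
            ys.append(y)
    n_pairs = len(xs)
    mean = sum(xs) / n_pairs                    # same for x and y by symmetry
    var = sum(x * x for x in xs) / n_pairs - mean ** 2
    prod = sum(x * y for x, y in zip(xs, ys)) / n_pairs
    return (prod - mean ** 2) / var, n_pairs

def intraclass_r_from_moments(classes):
    """Same coefficient from class sums and sums of squares only."""
    n_pairs = sum(len(c) * (len(c) - 1) for c in classes)
    # Each member of a class of n appears (n - 1) times in the table.
    mean = sum((len(c) - 1) * sum(c) for c in classes) / n_pairs
    var = sum((len(c) - 1) * sum(x * x for x in c) for c in classes) / n_pairs - mean ** 2
    # Sum over ordered pairs of x*y equals (sum x)^2 - sum x^2 within a class.
    prod = sum(sum(c) ** 2 - sum(x * x for x in c) for c in classes) / n_pairs
    return (prod - mean ** 2) / var

# Hypothetical litters of unequal size, invented for illustration.
litters = [[12, 13, 12], [14, 15, 14, 16], [11, 12]]
r_pairs, n_pairs = intraclass_r_from_pairs(litters)
print(n_pairs, round(r_pairs, 4), round(intraclass_r_from_moments(litters), 4))
```

The two routes agree, but the second never forms the table, which is the point when the number of combinations runs into the tens of thousands.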

These intra-class correlation formulae have been thrown into a
form suitable for measuring substratum heterogeneity in experi- 
mental cultures. 13 

If the x and y character of a pair are differentiated, spurious 
values of the correlation coefficient must result from the render- 
ing symmetrical of the correlation surface. Pearson many years 
ago recognized the difficulty in dealing with groups in which 
there is orderly differentiation due, for example, to growth. 14 
Attention has recently been directed 15 to difficulties arising when 
differentiation within the class may exist, but it may be difficult 
or impossible to arrange the individuals by any character out- 
side of themselves to obtain the constants necessary for deter- 
mining the true correlation from the spurious values deduced 

10 Harris, J. Arthur, "On the Formation of Correlation and Contingency
Tables when the Number of Combinations is Large," Amer. Nat., 45: 566-
571, 1911.

11 Harris, J. Arthur, "The Formation of Condensed Correlation Tables
when the Number of Combinations is Large," Amer. Nat., 46: 477-486,
1912.

12 Harris, J. Arthur, "On the Calculation of Intra-class and Inter-class
Coefficients of Correlation from Class Moments when the Number of Possible
Combinations is Large," Biometrika, 9: 446-472, 1913.

13 Harris, J. Arthur, "On a Criterion of Substratum Homogeneity or
Heterogeneity in Field Experiments," Amer. Nat., 49: 430-454, 1915.

14 Pearson, K., "On Homotyposis in Homologous but Differentiated Organs,"
Proc. Roy. Soc. Lond., 71: 288-313, 1903.

15 Harris, J. Arthur, "On Spurious Values of Intra-Class Correlation
Coefficients Arising from Disorderly Differentiation within the Classes,"
Biometrika, 10: 412-416, 1914.

from the tables. Whether the methods used in such cases by
Harris 16 will prove the best available remains to be seen. 

Considerable attention has recently been given to the probable 
error of the correlation coefficient. 

If the number of observations upon which r is based is large 
and if it does not approach too closely either of its limiting 
values of +1 or −1, the use of the formula of Pearson and
Filon,

$$E_r = 0.6745\,\frac{1 - r^2}{\sqrt{n}},$$

readily evaluated by the use of the tables of $1 - r^2$ given by
Soper 17 used in connection with the $\chi_1$ of Miss Gibson's Tables 18
or approximated by the Abac of Heron, 19 is quite legitimate.
But when either of these conditions is not realized the value of r 
found from a single sample will probably not be the true corre- 
lation for the population under consideration. 
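
Where tables are not at hand, the Pearson and Filon expression is easily evaluated directly. The following is a minimal sketch; the function name is an assumption, and the result is to be trusted only under the two conditions just stated (n large, r not near its limits).

```python
import math

def probable_error_r(r, n):
    """Large-sample probable error of r: 0.6745 (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

# The same r carries very different weight at different sample sizes.
for n in (25, 100, 400):
    print(n, round(probable_error_r(0.5, n), 4))
```

Quadrupling the number of observations only halves the probable error, which bears on the remarks on small samples that follow.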

Chemists, agriculturists, physiologists and many others often 
must necessarily reason from a relatively small number of obser- 
vations. It is therefore of very real importance that some valid 
measure of the statistical trustworthiness of such coefficients be 
known. Some of the problems concerning the probable error of r 
when it approaches its numerical limits or when the number of 
cases upon which it is based is small are discussed mathematically 
by Soper 20 as they have been attacked experimentally by "Student." 21
Further contributions to the subject are those of Fisher 22
and of Pearson, 23 who summarizes the series of studies and gives a 
table to facilitate the interpretation of correlation coefficients 
based on small samples. He says: 

16 Harris, J. Arthur, "On the Significance of Variety Tests," Science,
N. S., 36: 318-320, 1912, and Biometrika, l. c.

17 Soper, H. E., in "Tables for Statisticians and Biometricians."

18 Biometrika, 4: 385-392, 1906. Also in Pearson's Tables.

19 Heron, D., "An Abac for Determining the Probable Errors of Correlation
Coefficients," Biometrika, 7: 411, 1910. Also in Pearson's Tables.

20 Soper, H. E., "On the Probable Error of a Correlation Coefficient to a
Second Approximation," Biometrika, 9: 91-115, 1913.

21 "Student," "Probable Error of a Correlation Coefficient," Biometrika,
6: 302-310, 1908.

22 Fisher, R. A., "Frequency Distribution of the Values of the Correlation
Coefficient in Samples from an Indefinitely Large Population,"
Biometrika, 10: 507-521, 1915.

23 Pearson, K., "On the Distribution of Small Samples"; Appendix I to
papers by "Student" and R. A. Fisher, Biometrika, 10: 522-529, 1915.

We think it must be concluded that for samples of 50 the usual theory
of the probable error of the standard deviation holds satisfactorily, and 
that to apply it for the case of n = 25 would not lead to any error which 
would be of importance in the majority of statistical problems. 

The original papers should be read by those who are dealing 
with coefficients lying near the limits of the range of correlation, 
or who must work with small samples. Those who can by extra 
labor obtain larger series of data should do so, for no knowledge 
of the theory of the probable error can ever take the place of 
widened series of data, although it may be essential to the inter- 
pretation of constants based of necessity on a limited number of 
observations. 

Both Variates Measurable on a Quantitative Scale; Regression 
Non-Linear. — For cases in which the rate of change in the y character
can not be described by a straight line, the proper measure
of interdependence is Pearson's 24 correlation ratio, η. The value
of the correlation ratio is two-fold. (a) It furnishes a measure
of the interdependence of two variates in cases in which the use
of the correlation coefficient is not fully justified. (b) It affords
a means of testing, by the use of Blakeman's criterion, 25 for 
linearity of regression. Thus in deciding between the correla- 
tion coefficient and the correlation ratio, the calculation of each 
of the constants may, in critical cases, be necessary. 
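
The relation between the two constants may be sketched as follows. The code computes the square of the correlation ratio of y on x as the share of the total variation in y carried by the means of the x arrays, and sets it beside the square of r. The function names and data are illustrative assumptions; the excess of η² over r² is the quantity on which a test of linearity rests, but the criterion itself is not reproduced here.

```python
from collections import defaultdict

def correlation_ratio_sq(xs, ys):
    """eta^2 of y on x: between-array variation over total variation."""
    arrays = defaultdict(list)
    for x, y in zip(xs, ys):
        arrays[x].append(y)
    mean_y = sum(ys) / len(ys)
    between = sum(len(a) * (sum(a) / len(a) - mean_y) ** 2 for a in arrays.values())
    total = sum((y - mean_y) ** 2 for y in ys)
    return between / total

def r_squared(xs, ys):
    """Square of the product-moment coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

# Hypothetical data in which y first falls and then rises with x.
xs = [-1, -1, 0, 0, 1, 1]
ys = [1.0, 1.2, 0.0, 0.2, 1.1, 0.9]
print(round(correlation_ratio_sq(xs, ys), 3), round(r_squared(xs, ys), 3))
```

Because only array means are needed, the x classes in `correlation_ratio_sq` may equally be mere labels, which is why the same constant serves when one variate is describable in categories only.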

A further test of the goodness of fit of regression curves has 
also been given by Slutsky. 26 This method, which involves the 
well-known χ² of Pearson's test for goodness of fit, should have
wide usefulness. An illustration of its application has recently 
been given by Pearl. 27 

One Variate Describable in Multiple Categories, the other 
Measurable on a Quantitative Scale. — Such cases are occasionally 
met with in many fields of work. For example, one may desire 
to know in fractions of a scale ranging from 0 to 1 the relationship
between any describable but not measurable environmental

24 Pearson, K., "On the General Theory of Skew Correlation and Non-
Linear Regression," Drapers' Co. Res. Mem., Biom. Ser., II, Dulau and Co.,
1905.

25 Blakeman, J., "On Tests for Linearity of Regression in Frequency
Distributions," Biometrika, 4: 332-350, 1905.

26 Slutsky, E., "On the Criterion of the Goodness of Fit of Regression
Lines and on the Best Method of Fitting them to the Data," Jour. Roy.
Stat. Soc., 77: 78-84, 1914.

27 Pearl, R., "An Important Contribution to Statistical Theory," Amer.
Nat., 48: 505-507, 1914.

factor and any measurable characteristic of the organisms subjected
to its influence. Or in testing the assertions of such writers
on criminology as Lombroso and Havelock Ellis against the
results of actual measurements of criminals, one may find it
desirable to correlate between the kind of crime and any cephalic
measurement.

For the analysis of such data the correlation ratio may be of 
great service. 

One Character Alternative, the other Measurable on a Quanti- 
tative Scale. — Suppose now that one of the correlation ratio tables 
of the kind discussed in the foregoing paragraph were reduced, 
as far as the qualitatively appreciable but not measurable char- 
acter is concerned, to two classes only, while the measured variate 
remained as before. Such tables actually occur in practise with 
great frequency. For example, one may wish to correlate be- 
tween the form of a dimorphic crustacean and physical measure- 
ments. Or it may be desirable to ascertain the correlation be- 
tween type (tubular or ligulate) of a composite flower and the 
number of divisions in the corolla. Or one may wish to measure 
the relationship between type and time required for germination 
in the seeds of a dimorphic plant species. Or a series of individu- 
als may be classified by the social worker or prison warden as 
alcoholic and non-alcoholic and the investigator desires to corre- 
late between alcoholism (which is really a graduated character, 
although classified in the available records into the two alterna- 
tive classes only) and any physical measurement or the extent of 
criminality as measured by number of convictions or months 
spent in prison. 

In this reduced form the data can no longer be treated by the 
correlation ratio method, but must be attacked by a recent formula
due to Pearson, 28 and known as the Bi-serial correlation
coefficient. 
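
A minimal sketch of the bi-serial coefficient in its usual textbook form is given below: the difference between the means of the measured character in the two classes, divided by the standard deviation of the whole series and multiplied by pq/z, where p and q are the proportions in the two classes and z is the ordinate of the normal curve at the point of division. The function name and data are assumptions for illustration, and the form rests on the assumption that the alternative character is really a graduated, normally distributed one.

```python
from statistics import NormalDist, fmean, pstdev

def biserial_r(measurements, in_class):
    """Bi-serial r between a measured variate and a two-class grouping.

    in_class holds a true or false value for each measurement; both
    classes must be non-empty, otherwise the normal ordinate is undefined.
    """
    upper = [m for m, flag in zip(measurements, in_class) if flag]
    lower = [m for m, flag in zip(measurements, in_class) if not flag]
    p = len(upper) / len(measurements)
    q = 1.0 - p
    normal = NormalDist()
    z = normal.pdf(normal.inv_cdf(p))  # ordinate at the dividing point
    return (fmean(upper) - fmean(lower)) / pstdev(measurements) * p * q / z

# Hypothetical germination times (days) for two seed types.
days = [4, 5, 5, 6, 7, 6, 8, 9, 7, 10]
type_b = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(round(biserial_r(days, type_b), 3))
```

The value is not bounded by unity when the normal assumption fails badly, which is one practical reason for attending to its probable error.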

Soper 29 has continued his work on the probable error by deter- 
mining the standard deviation of constants calculated by this 
formula. 

Both Characters Classified in Multiple Categories. — If instead 

28 Pearson, K., "On a New Method of Determining Correlation Between
a Measured Character A, and a Character B of Which Only the Percentage
of Cases Wherein B Exceeds (or Falls Short of) a Given Intensity is Recorded
for Each Grade of A," Biometrika, 7: 96-105, 1909.

29 Soper, H. E., "On the Probable Error of the Bi-serial Expression for
the Correlation Coefficient," Biometrika, 10: 384-390, 1914.

of both characters being measurable on a quantitative scale, or
one character recorded in a number of categories and the other 
measurable on a quantitative scale, both characters are not quan- 
titatively measurable, but describable in a number of classes only, 
neither the correlation coefficient nor the correlation ratio can be 
used. In such cases, which in practical work are very frequent, 
Pearson's contingency methods 30 must be used. These have been 
too long in use to require discussion or illustration here. Certain 
corrections to be applied will be considered at another time. 
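
For readers who have not met the contingency method, the uncorrected coefficient of mean square contingency may be sketched as follows. The function names and the table are illustrative assumptions; no correction for number of cells or for grouping is attempted.

```python
import math

def chi_square(table):
    """Chi-square of a rows-by-columns frequency table against independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            total += (observed - expected) ** 2 / expected
    return total

def contingency_coefficient(table):
    """Mean square contingency coefficient: sqrt(chi^2 / (N + chi^2))."""
    n = sum(sum(row) for row in table)
    chi2 = chi_square(table)
    return math.sqrt(chi2 / (n + chi2))

# Hypothetical table: flower color (rows) against leaf form (columns).
table = [[30, 10, 5],
         [12, 25, 8],
         [6, 9, 20]]
print(round(contingency_coefficient(table), 3))
```

With a finite number of categories the coefficient cannot reach unity even for perfect dependence, which is one reason corrections are required.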

The probable error of the contingency coefficient presents con- 
siderable difficulty. Those who have to deal with it should con- 
sult papers by Blakeman and Pearson 31 and by Pearson. 32 

One Variate Classified in Alternative, the Other in Multiple 
Categories. — Consider a contingency table reduced to a two-fold 
grouping for one of the characters, but retaining the multiple 
division for the other. Such a table is comparable with the con- 
densation of the correlation ratio table discussed above. It must 
be analyzed by a special method. 33 

The formula has not as yet had extensive practical application. 
It has been used to determine the relationship between alcoholism 
as an alternative character and type of crime classed in multiple 
categories, and between alcoholism in the parent and health of 
the children. It may prove especially valuable in dealing with 
the interrelationship of various teratological conditions in mor- 
phological work. 

Both Characters Classified in Alternative Categories Only. — As 
the extreme case we may think of a contingency table reduced 
to a two-fold grouping for each of the characters. This is then 
the four-fold table for alternative characters, i. e., (A) and
(not-A), (B) and (not-B).

In the past, two methods have been chiefly employed for obtaining
constants from such tables, Pearson's four-fold correlation
coefficient and Yule's coefficient of association.

30 Pearson, K., "On the Theory of Contingency and its Relation to Association
and Normal Correlation," Drapers' Co. Res. Mem., Biom. Ser., I.
Dulau & Co., 1904.

31 Blakeman, John, and K. Pearson, "On the Probable Error of Mean
Square Contingency," Biometrika, 5: 191-197, 1906.

32 Pearson, K., "On the Probable Error of a Coefficient of Mean Square
Contingency," Biometrika, 10: 570-573, 1915.

33 Pearson, K., "On a New Method of Determining Correlation when One
Variable is Given in Alternative and the Other in Multiple Categories,"
Biometrika, 7: 248-257, 1909.

For several years critical workers have realized that very little
reliance is to be placed upon Yule's very simple coefficient of 
association. This coefficient and another measure of correlation 
"the theoretical value of r" proposed in his "Introduction to 
the Theory of Statistics" have been discussed by Heron. 34 
Pearson and Heron 35 and Pearson 36 have gone into these methods 
and others proposed by Yule 37 in a masterly way. To discuss 
this memoir alone would require far more than the space avail- 
able for this general index of the correlation methods. Their 
treatment can leave no doubt — if any existed in the minds of 
those who have tried to use these formulae in serious statistical 
work— that except in very special cases all these association and 
colligation formulae are likely to work harm rather than to be of 
service in the hands of the biologist. 
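
The simplicity of these coefficients, and one commonly noted peculiarity of them, may be seen from a short sketch. The cell letters a, b, c, d and the function names are assumptions for illustration; the example shows only that the coefficient of association reaches unity whenever a single cell is empty, and is not offered as a summary of the arguments of the memoirs cited.

```python
import math

def yule_association(a, b, c, d):
    """Yule's coefficient of association Q = (ad - bc) / (ad + bc)."""
    return (a * d - b * c) / (a * d + b * c)

def yule_colligation(a, b, c, d):
    """Yule's coefficient of colligation, built on sqrt(ad) and sqrt(bc)."""
    root_ad = math.sqrt(a * d)
    root_bc = math.sqrt(b * c)
    return (root_ad - root_bc) / (root_ad + root_bc)

# One empty cell gives Q = 1 although half of the second row disagrees.
print(yule_association(10, 0, 5, 5))
# A table with no association gives 0 by either formula.
print(yule_association(10, 10, 10, 10), yule_colligation(10, 10, 10, 10))
```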

This demonstration of the untrustworthiness of the various 
substitutes for the correlation coefficient practically throws us 
back upon the old four-fold method of Pearson, and upon another 
novel method to be discussed in a moment. The difficulty of com- 
putation has been one of the greatest obstacles in the way of the 
more general application of this method and has frequently 
resulted in the substitution of the less reliable coefficient of as- 
sociation. The necessary labor of calculation has been much 
reduced by two series of tables by Everitt. 38 

The determination of the probable error of the coefficient of 
correlation calculated from the four-fold grouping has always 
been excessively laborious. While four-fold correlations have 
been calculated in hundreds of cases, the determination of the 
probable error has been made for less than a hundred of the 
coefficients. Pearson 39 has now given tables to facilitate the
calculation of approximate probable errors which are sufficiently
exact for all practical purposes.

34 Heron, D., "The Danger of Certain Formulas Suggested as Substitutes
for the Correlation Coefficient," Biometrika, 8: 109-122, 1911.

35 Pearson, K., and D. Heron, "On Theories of Association," Biometrika,
9: 159-315, 1913.

36 Pearson, K., "Note on the Surface of Constant Association,"
Biometrika, 9: 534-537, 1913.

37 Yule, G. U., "On the Methods of Measuring Association between Two
Variates," Jour. Roy. Stat. Soc., 75: 579-641, 1912.

38 Everitt, P. F., "Tables of the Tetrachoric Functions for Four-fold
Correlation Tables," Biometrika, 7: 437-451, 1909; "Supplementary Tables
for Finding the Correlation Coefficient from Tetrachoric Groupings,"
Biometrika, 8: 385-395, 1912. Also in "Tables for Statisticians and
Biometricians."

39 Pearson, K., "On the Probable Error of a Coefficient of Correlation as

Finally, the most important recent development in the theory 
of correlation is probably Pearson's novel method of dealing with 
variates classed in alternate categories only. 40 

The fundamental conception of this method is exceedingly 
simple. Given the table, 

              A        not-A     Total
  B           a          b       a + b
  not-B       c          d       c + d
  Total     a + c      b + d       N

where the large letters represent any alternative (e. g., Mendelian)
characteristic of an individual, and the small letters denote the
frequency of occurrence of the several possible combinations, it is
clear that

$$\frac{a+c}{N}, \quad \frac{b+d}{N}, \quad \frac{a+b}{N}, \quad \frac{c+d}{N}$$



give the independent probabilities of the two pairs of character- 
istics. The four pertinent products of these ratios give the 
chances on the assumption of the independence of the two char- 
acters A and B, of the four possible combinations. Then if there 
be no correlation, within the limits of the errors of random 
sampling 



$$\frac{a}{N} - \frac{a+c}{N}\cdot\frac{a+b}{N} = 0,$$



and so on. The squares of the four differences between the observed
frequencies, a, b, c, d, and those which would be expected
if the two characters were really independent, each divided by the
expected frequency and summed, give the familiar
χ² of Pearson's test for goodness of fit. The significance of this
test may be determined from Palin Elderton's tables, 41 and this 
is, in the case in hand, a measure of correlation. It has been 

Found from a Four-fold Table," Biometrika, 9: 22-27, 1913. Also in
Tables for Statisticians and Biometricians.

40 Pearson, K., "On a Novel Method of Regarding the Association of
Two Variates Classed Solely in Alternate Categories," Drapers' Co. Res.
Mem., Biom. Ser., VII, Dulau and Co., 1912.

41 Elderton, W. P., "Tables for Testing the Goodness of Fit of Theory
to Observation," Biometrika, 1: 155-163, 1901. Also reprinted in Pearson's
volume of tables.

used as such during the past several years by some of us in practical
problems in which we found it impossible to place reliance
upon Yule's coefficients and did not feel warranted, because of 
underlying assumptions, in depending solely upon the classical 
four-fold method. But it is a measure given in terms utterly in- 
comprehensible to the ordinary mind, which is quite incapable of 
thinking in millions or in multiples of millions. 
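
The arithmetic for the four-fold case may be sketched as follows, taking a, b, c, d as the four observed frequencies. The function names and the counts are assumptions for illustration. The sketch stops at χ² itself; the passage to a probability, and from that to the scale of correlation, requires the tables cited and is not attempted.

```python
def expected_fourfold(a, b, c, d):
    """Frequencies expected if the two characters were independent."""
    n = a + b + c + d
    return (
        (a + b) * (a + c) / n,
        (a + b) * (b + d) / n,
        (c + d) * (a + c) / n,
        (c + d) * (b + d) / n,
    )

def chi_square_fourfold(a, b, c, d):
    """Sum of (observed - expected)^2 / expected over the four cells."""
    observed = (a, b, c, d)
    expected = expected_fourfold(a, b, c, d)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_shortcut(a, b, c, d):
    """Algebraically equal form: N (ad - bc)^2 over the four marginal totals."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts, invented for illustration.
print(expected_fourfold(20, 10, 10, 20))
print(round(chi_square_fourfold(20, 10, 10, 20), 4))
```

The difference a minus its expected value is the same in magnitude for all four cells, which is why the shortcut form depends on the data only through ad − bc and the marginal totals.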

What Pearson has done with such brilliancy is to furnish a 
means in mathematical theory and working tables of passing from 
the incomprehensible scale of pure probability to the familiar and 
usable and widely comparable scale of correlation. 

As yet it is too soon to be able to state the results of extensive 
practical application of the new coefficients, but they should have 
wide usefulness. 

Both Characters Classified by Rank in Series. — In some cases,
neither measurements nor classification of the individuals dealt 
with in categories are given in the data, but merely their position 
or rank in the series. 

Rank may be numerically expressed, and the suggestion has
been made that the correlation of grades or ranks is a quite legiti- 
mate measure of interdependence in such cases. Pearson 42 has, 
however, pointed out the very real difficulties encountered in 
such work. Those who are tempted to use these methods should 
acquaint themselves with the dangers as pointed out in this 
memoir. 
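
A minimal sketch of the correlation of ranks follows, together with the conversion r = 2 sin(πρ/6), which passes from the rank coefficient ρ to the product-moment scale on the assumption of normally distributed variates. The function names and data are illustrative assumptions, ties are assumed absent, and the sketch does nothing to remove the difficulties referred to above.

```python
import math

def ranks(values):
    """Rank 1 for the smallest value; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def rank_correlation(xs, ys):
    """rho = 1 - 6 * sum(d^2) / (n (n^2 - 1)), d being the rank difference."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))

def rank_to_product_moment(rho):
    """Conversion valid only under the normal assumption: r = 2 sin(pi rho / 6)."""
    return 2.0 * math.sin(math.pi * rho / 6.0)

# Hypothetical scores, invented for illustration.
rho = rank_correlation([10, 20, 30, 40, 50], [2, 1, 4, 3, 5])
print(round(rho, 3), round(rank_to_product_moment(rho), 3))
```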

One Variate Given by Rank in Series, the Other Measured on
a Quantitative Scale. — Such cases are not likely to occur with 
great frequency in biological work. Possible instances are those 
in which one wishes to correlate between position in an intensity 
of pigmentation series and size or fertility — both quantitatively 
measurable characters. The formulae have been given by 
Pearson. 43 

One Variate Given by Multiple or Broad Categories, the Other 
by Rank in Series. — Practical applications in biology should be
rare. For formulae see the paper by Pearson just cited. 

From the foregoing outline it must be clear that of recent 
years the conception of correlation has been greatly extended and 
the possibilities of the practical usefulness of correlation methods 

42 Pearson, K., "On Further Methods of Determining Correlation,"
Drapers' Co. Res. Mem., Biom. Ser., IV, Dulau and Co., 1907.

43 Pearson, K., "On an Extension of the Method of Correlation by Grades
or Ranks," Biometrika, 10: 416-418, 1914.

vastly increased by the deduction of formulae suitable for dealing
with data of the most diverse sorts. 

The most valuable feature of a summary such as the present 
may possibly not lie in the fact that it exhibits to biologists the 
wide array of statistical tools which are now available for dealing 
with the most diverse sorts of data which they may collect, and 
shows where directions for their use may be found, but in the 
suggested warning that the hasty application of the first learned 
or the most easily calculated formula may lead to constants of 
little value. Most biologists can use a scalpel or a beaker with 
great success, but many at least would hesitate to try to handle 
without special training all the instruments which are to be seen 
in the surgeon's case or to use all the glassware on the organic 
chemist's shelves. Each kind of tool requires its special training.
Notwithstanding popular conceptions to the contrary, this
is also true of the biometric tools. 

J. Arthur Harris 



